DETAILED ACTION



Response to Amendment 



This communication is in response to the Amendment filed 8/14/2007. 



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 



1. Claims 1-3, 5-6, 9, 11-12, 14, 16, 26-37, 39-42 are rejected under 35 U.S.C. 103(a)
as being unpatentable over Rubin et al ("Rubin", US 2002/0099552) in view of Heck et
al ("Heck", "A Survey of Web Annotation Systems"), Mitchell et al ("Mitchell", US
5,857,099) and Balabanovic ("Balabanovic", "Multimedia Chronicles for Business
Communications").

As per independent claim 1, Rubin discloses an apparatus for direct annotation
of objects, the apparatus comprising: a display device for displaying one or more 
images (Figure 1 item 107); an audio input device for receiving an audio signal (Figure 1 
item 118); a storage device for storing a plurality of visual notations ([0032]), the plurality of
different visual notations being text or a graphic image (Figure 4 item 401); and a direct
annotation creation module coupled to receive an input audio signal from the audio 
input device and to receive a reference to a location within an image from the display 
device (Figure 4, [0048] lines 1-6, [0066] lines 1-3), the direct annotation creation 
module, in response to receiving the input audio signal and the reference to the
location within the image (Figure 4), automatically creating an annotation object, 
independent from the image, that associates the input audio signal with the location 
(Figure 4). Rubin fails to distinctly point out having a plurality of different visual 
notations. However, Heck teaches a plurality of different visual notations (Page 2 lines 
25-28). Therefore it would have been obvious to an artisan at the time of the invention 
to combine the different notations of Heck with the apparatus of Rubin. Motivation to do 
so would have been to provide a distinguishable mark to decipher between annotations. 
Rubin-Heck fails to distinctly point out details of the audio voice recognition technology. 
However, Mitchell discloses an apparatus further comprising: an audio vocabulary 
storage for storing a plurality of audio signals and corresponding text strings (Mitchell, 
Figure 2 item 20); an audio vocabulary comparison module coupled to the audio input 
device (Column 5 lines 25-65). Therefore it would have been obvious to an artisan at
the time of the invention to combine the vocabulary-based audio conversion of Mitchell 
with the system of Rubin-Heck. Motivation to do so would have been to provide a
convenient way to distinguish voice patterns for recognition. Rubin-Heck-Mitchell fails to 
distinctly point out corresponding a visual notation with an audio input. However, 
Balabanovic teaches corresponding one of the plurality of visual notations that matches 
the audio input (Figure 5 page 6). Therefore it would have been obvious to an artisan at 
the time of the invention to combine the teaching of Balabanovic with the system of 
Rubin-Heck-Mitchell. Motivation to do so would have been to provide a clear indication 
of who made the annotation so that if a user cannot recognize the voice associated with 
the annotation a corresponding picture would clarify. 

As per claim 2, which is dependent on claim 1, the modified Rubin discloses the 
apparatus of claim 1 further comprising an annotation display module coupled to the 
direct annotation creation module, the annotation display module generating symbols or 
text representing the annotation objects (Rubin, Figure 4 item 401). 

As per claim 3, which is dependent on claim 1, the modified Rubin discloses an 
annotation audio output module coupled to the direct annotation creation module, the 
annotation audio output module generating audio output in response to user selection of 
an annotation symbol representing an annotation object (Rubin, [0102] lines 1-25, 
[0132] lines 1-9). 



Application/Control Number: 10/043,575 Page 5 

Art Unit: 2174 

As per claim 5, which is dependent on claim 1, the modified Rubin discloses the 
apparatus further comprising: an audio vocabulary storage for storing a plurality of audio 
signals and corresponding text strings (Mitchell, Figure 1 item 20); a dynamic 
vocabulary updating module coupled to the audio vocabulary storage and the audio 
input device (Mitchell, Column 5 lines 25-65), the dynamic vocabulary updating module 
for displaying an interface to create a new entry in the audio vocabulary storage 
(Mitchell, Column 5 lines 25-65), the dynamic vocabulary updating module receiving an 
audio input and a text string and creating the new entry in the audio vocabulary storage 
that includes a new visual annotation (Mitchell, Column 5 lines 25-65, Column 9 lines 
45-56). 

As per claim 6, which is dependent on claim 1, the modified Rubin discloses the
apparatus of claim 1 further comprising a media object cache for storing media and
annotation objects ([0046]-[0048]).

Claim 9 is similar in scope to that of claim 1, and is therefore rejected under
similar rationale. 



As per claim 26, the modified Rubin discloses a method for direct annotation of 
objects, the method comprising the steps of: displaying an image (Rubin, Figure 4); 
receiving audio input (Rubin, [0041] lines 1-23); detecting selection of a location within
the image (Rubin, [0086] -[0087]); comparing the audio input to a vocabulary to 
produce text or a graphic image (Mitchell, Column 5 lines 25-65, Balabanovic, Page 6 
Figure 5); and creating an annotation that provides association between the image, the 
audio input, the selected location, and one of a plurality of different visual notations
comprising text or a graphic image (Rubin, [0097] lines 1-9, Heck, Page 2 lines 25-34,
Balabanovic, Figure 5), the annotation object including at least an audio input field, an 
image reference field, and an annotation location field ([0046]-[0049]), the creating step 
occurring automatically in response to the receiving or detecting steps (Rubin, [0129]). 
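
Purely as an illustration for the reader, and not as part of the rejection or of any cited reference, the annotation object recited in claim 26 can be pictured as a record that holds the three recited fields and is created as soon as both the audio input and the selected location are available. The Python sketch below is a minimal one under that assumption; the names AnnotationObject, create_annotation, and storage are hypothetical and do not come from Rubin, Heck, Mitchell, Balabanovic, or the application.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AnnotationObject:
    """Hypothetical record with the three fields recited in claim 26."""
    audio_input: bytes                     # audio input field
    image_reference: str                   # image reference field
    annotation_location: Tuple[int, int]   # annotation location field (x, y)
    visual_notation: Optional[str] = None  # text or the name of a graphic image


def create_annotation(
    image_reference: str,
    audio_input: Optional[bytes],
    location: Optional[Tuple[int, int]],
    storage: List[AnnotationObject],
    visual_notation: Optional[str] = None,
) -> Optional[AnnotationObject]:
    """Create the annotation automatically once both inputs have arrived.

    Returns None while either the audio input or the location is missing.
    """
    if audio_input is None or location is None:
        return None
    annotation = AnnotationObject(audio_input, image_reference, location, visual_notation)
    storage.append(annotation)  # kept in its own storage, separate from the image
    return annotation


# Hypothetical usage: no annotation until both inputs are present.
storage: List[AnnotationObject] = []
pending = create_annotation("photo-001", None, (40, 25), storage)
created = create_annotation("photo-001", b"\x00\x01", (40, 25), storage, "person")
```

The record refers to the image only by reference, which is one way to read the recited independence of the annotation object from the image.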

As per claim 27, which is dependent on claim 26, the modified Rubin discloses
further comprising the step of recording the audio input received (Rubin, [0041] lines 1-23).

As per claim 28, which is dependent on claim 27, the modified Rubin discloses 
the method, wherein the step of creating an annotation includes creating an annotation 
object including a reference to the selected image, the recorded audio input and one of 
the plurality of different annotations (Rubin, [0086] -[0087], Balabanovic, Figure 5 
column 9 lines 37-52), and storing the annotation object in an object storage (Rubin, 
[0046] -[0049]). 

As per claim 29, which is dependent on claim 26, the modified Rubin discloses
the method, wherein the step of creating an annotation includes creating an annotation 
object and storing the text as part of the annotation object (Rubin, [0086] -[0087]). 

As per claim 30, which is dependent on claim 26, the modified Rubin discloses a 
method further comprising the steps of determining if the audio input has a matching 
entry in the vocabulary (Mitchell, Column 5 lines 25-65); and storing the entry as part of 
the annotation object if the audio input has a matching entry in the vocabulary (Rubin, 
[0046] -[0049]). 

As per claim 31, which is dependent on claim 29, the modified Rubin discloses a
method, further comprising the steps of: determining if the audio input has a close 
match in the vocabulary; displaying the close matches; receiving input selecting a close 
match (Mitchell, Column 9 lines 45-65); and storing the selected close match as part of 
the annotation object if the audio input has a close match in the vocabulary (Mitchell, 
Column 9 lines 45-65). 

As per claim 32, which is dependent on claim 30, the modified Rubin discloses 
the method, further comprising the step of displaying a message that the image has not 
been annotated if there is neither a matching entry in the vocabulary nor a close match 
in the vocabulary (Mitchell, Column 9 lines 45-56). 

As per claim 33, which is dependent on claim 30, the modified Rubin discloses a
method, further comprising the following steps if there is neither a matching entry in the 
vocabulary nor a close match in the vocabulary: receiving text input corresponding to 
the audio input (Mitchell, Column 9 lines 45-56); updating the vocabulary with a new 
entry including the audio input and the text input (Mitchell, Column 5 lines 25-65, 
Column 9 lines 45-65); and wherein the received text is stored as part of the annotation 
object (Rubin, [0046] -[0049]). 

As per claim 34, which is dependent on claim 26, the modified Rubin discloses a 
method, further comprising the steps of: receiving text input corresponding to the audio 
input (Mitchell, Column 9 lines 45-56); updating the vocabulary with a new entry 
including the audio input and the text input (Mitchell, Column 5 lines 25-65, Column 9 
lines 45-65). 

As per independent claim 35, the modified Rubin discloses a computer 
implemented method for displaying objects with annotations, the method comprising the 
steps of: retrieving an image (Rubin, Figure 4); displaying the image with a plurality of 
different visual notations (Heck, Page 2 lines 25-28, Balabanovic, Figure 5) that an 
annotation exists (Rubin, Figure 4 item 401); receiving user selection of the visual 
notation (Rubin, [0086] lines 1-15, [0102] lines 1-25); generating the annotation
automatically, in response to user input of a location within the image and an audio 
input, and outputting the annotation associated with the selected visual notation (Rubin,
[0080] lines 1-17, [0086] lines 1-15, Balabanovic, Figure 5); determining whether the
annotation includes text; retrieving a text annotation for the selected visual notation; and 
displaying the retrieved text with the image (Rubin, [0086] lines 1-15). 

As per claim 36, which is dependent on claim 35, the modified Rubin discloses a 
method wherein the annotation is text and the step of outputting is displaying the text 
proximate the image that it annotates (Rubin, [0086] lines 1-15). 

As per claim 37, which is dependent on claim 35, the modified Rubin discloses a 
method wherein the annotation is an audio signal and the step of outputting is playing 
the audio signal (Rubin, [0086] lines 1-15). 

As per claim 39, which is dependent on claim 35, the modified Rubin discloses a 
method further comprising the steps of: determining whether the annotation includes an 
audio signal; retrieving an audio signal for the selected image; and wherein the step of
outputting is playing the audio signal (Rubin, [0086] lines 1-15). 

As per independent claim 40, the modified Rubin discloses a method for 
retrieving images, the method comprising the steps of: receiving audio input (Rubin, 
[0096] -[0099]); determining annotation objects that reference a close match to the 
audio input (Rubin, [0096] -[0099]); automatically creating an annotation object, 
independent of the image (Figure 4, [0123] - [0132]), that associates the input audio 
signal and the location, and the direct annotation creation module automatically
terminating a recording of the input audio signal based on a predetermined audio level 
([0080] lines 1-17); retrieving the images that are referenced by the determined 
annotation objects (Rubin, [0096] -[0099]); and displaying the retrieved images (Rubin, 
[0096] -[0099], Balabanovic Figure 5), one of a plurality of different visual notations for 
the annotation object (Heck, Page 2 lines 25-34, Balabanovic Figure 5); and wherein the
annotation object includes at least an audio input field, an image reference field, and
an annotation location field (Rubin, [0046]-[0049]).

As per claim 41, which is dependent on claim 40, the modified Rubin discloses a
method wherein the step of determining annotation objects further comprises the steps
of: comparing the audio input to an audio signal referenced by an annotation object
(Column 5 lines 30-35); and determining a close match between the audio input and the
audio signal referenced by an annotation object if a probability metric is greater than a
threshold (Column 5 lines 35-38). Rubin-Heck-Mitchell-Balabanovic fails to disclose a
threshold of 80%. However, Official Notice is taken that a threshold of 80% is well
known in the art; 80% is not a definitive threshold and could be replaced by any other
value. Therefore it would have been obvious to combine the method of
Rubin-Heck-Mitchell-Balabanovic with the current teaching. Motivation to do so would
have been to provide a standard of matching.

As per claim 42, which is dependent on claim 40, the modified Rubin discloses a
method wherein the step of determining annotation objects further comprises the steps
of: determining the annotation objects for a plurality of images; for each annotation
object, comparing the audio input to an audio signal referenced by an annotation object
(Column 5 lines 30-35); and determining a close match between the audio input and the
audio signal referenced by an annotation object if a probability metric is greater than a
threshold (Column 5 lines 35-38). Rubin-Heck-Mitchell-Balabanovic fails to disclose a
threshold of 80%. However, Official Notice is taken that a threshold of 80% is well
known in the art; 80% is not a definitive threshold and could be replaced by any other
value. Therefore it would have been obvious to combine the method of
Rubin-Heck-Mitchell-Balabanovic with the current teaching. Motivation to do so would
have been to provide a standard of matching.
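
As a reader's illustration only, the close-match determination recited in claims 41 and 42 amounts to comparing a probability metric against a threshold for each annotation object. The Python sketch below assumes the metric has already been computed by some recognizer and is supplied as a number between 0 and 1, keyed by the image each annotation object references. The function names and sample scores are hypothetical and are not drawn from any cited reference; the 0.80 default simply mirrors the 80% figure discussed above, which could be replaced by any other value.

```python
from typing import Dict, List


def is_close_match(probability: float, threshold: float = 0.80) -> bool:
    """A close match requires the metric to be strictly greater than the threshold."""
    return probability > threshold


def images_with_close_matches(
    scores: Dict[str, float], threshold: float = 0.80
) -> List[str]:
    """Return the image references whose annotation audio closely matches the query.

    `scores` maps an image reference to the (assumed, precomputed) probability
    that its annotation audio matches the spoken query.
    """
    return [
        image
        for image, probability in scores.items()
        if is_close_match(probability, threshold)
    ]


# Hypothetical scores for three annotated images.
scores = {"photo-001": 0.91, "photo-002": 0.80, "photo-003": 0.42}
matches = images_with_close_matches(scores)
```

Because the comparison is "greater than", a score exactly equal to the threshold is not treated as a close match in this sketch.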

As per claim 11, which is dependent on claim 26, the modified Rubin discloses a
method where the step of displaying is performed before or simultaneously with the step 
of receiving (Rubin, [0127] -[0129]). 

As per claim 12, which is dependent on claim 26, the modified Rubin discloses a 
method wherein the step of receiving is performed before or simultaneously with the 
step of displaying (Rubin, [010ip. 

As per claim 14, which is dependent on claim 26, the modified Rubin teaches a
method further comprising the step of displaying the one of the plurality of different 
visual notations to indicate that the image has an annotation (Rubin, Figure 4 item 401). 

As per claim 15, which is dependent on claim 14, the modified Rubin teaches 
one of the plurality of different visual notations being text or a symbol (Rubin, Figure 4 
item 401). 

As per claim 16, which is dependent on claim 26, the modified Rubin discloses a 
method wherein the step of creating an annotation includes creating an annotation 
object and storing the annotation object in an object storage (Rubin, [0046]-[0049]). 



Response to Arguments 

Applicant's arguments with respect to claims 1-3,5-6,9,11-12,14,16,26-37,39-42 
have been considered but are moot in view of the new ground(s) of rejection. 

Conclusion

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Ryan F Pitaro, whose telephone number is 571-272-4071.
The examiner can normally be reached 7:00am - 4:30pm M-Th, and alternating F.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Kristine Kincaid, can be reached at 571-272-4063. The fax phone number
for the organization where this application or proceeding is assigned is 703-872-9306.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR.
Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should
you have questions on access to the Private PAIR system, contact the Electronic
Business Center (EBC) at 866-217-9197 (toll-free).

/Sy D. Luu/ 
Sy D. Luu 
Primary Examiner 

Ryan Pitaro 
Patent Examiner 
Art Unit 2174 